Comments for MEDB 5502, Week 05

Topics to be covered

  • What you will learn
    • Principal components analysis
    • Applications of principal components
    • Factor analysis
    • Criticisms of principal components analysis and factor analysis

Philosophy behind principal components, 1 of 4

  • Reduce complexity by modeling inter-relationships
  • Inter-relationships are assumed to be linear
    • There is no dependent or outcome variable in principal components analysis

Philosophy behind principal components, 2 of 4

  • First principal component
    • Linear combination that accounts for the most variation
    • The coefficients of this linear combination form the first eigenvector
    • The amount of variation accounted for is the first eigenvalue
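
The link between the first principal component and the first eigenvector can be checked numerically. The following is a minimal sketch, assuming NumPy and simulated data (the data, the seed, and all variable names are illustrative and not taken from the lecture). It computes the eigen-decomposition of the covariance matrix and confirms that the variance of the first component equals the largest eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 observations of 3 variables sharing a common signal.
z = rng.normal(size=(200, 1))
X = np.hstack([z + 0.5 * rng.normal(size=(200, 1)) for _ in range(3)])

# Eigen-decomposition of the covariance matrix.
S = np.cov(X, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(S)  # eigh returns ascending order

# Re-order so the largest eigenvalue comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# First principal component: centered data times the first eigenvector.
first_pc = (X - X.mean(axis=0)) @ eigenvectors[:, 0]
first_pc_variance = first_pc.var(ddof=1)

print(first_pc_variance, eigenvalues[0])
```

The two printed numbers should agree up to rounding error.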

Philosophy behind principal components, 3 of 4

  • Need to resolve an ambiguity
    • \(3X_1+5X_2-4X_3+7X_4-1X_5\) versus \(6X_1+10X_2-8X_3+14X_4-2X_5\)
    • Solution: require sum of squared coefficients to equal 1
    • Note: \(3^2+5^2+(-4)^2+7^2+(-1)^2=100\)
    • Use \(\frac{3}{10}X_1+\frac{5}{10}X_2-\frac{4}{10}X_3+\frac{7}{10}X_4-\frac{1}{10}X_5\)
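
The rescaling above can be reproduced in a few lines. This is a small sketch, assuming NumPy; the array names are illustrative. It shows that the two sets of coefficients collapse to the same unit-length vector once each is divided by the square root of its sum of squares.

```python
import numpy as np

a = np.array([3.0, 5.0, -4.0, 7.0, -1.0])  # coefficients from the slide
b = 2 * a                                   # the doubled version

# Divide each vector by its length (square root of the sum of squares).
unit_a = a / np.linalg.norm(a)
unit_b = b / np.linalg.norm(b)

print(np.sum(a ** 2))   # sum of squared coefficients before rescaling
print(unit_a)           # rescaled coefficients
```

The constraint removes the ambiguity of scale only. The sign is still arbitrary, because multiplying every coefficient by -1 also gives a sum of squares equal to 1.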

Philosophy behind principal components, 4 of 4

  • Second principal component
    • Linear combination that accounts for the second most variation
    • Must be uncorrelated with first principal component
    • The coefficients of this linear combination form the second eigenvector
    • The amount of variation accounted for is the second eigenvalue
  • Third principal component defined similarly
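
The requirement that later components be uncorrelated with earlier ones can be verified directly. The following sketch assumes NumPy and simulated data (all names and the seed are illustrative). It computes every component score and checks that their covariance matrix is diagonal, with the eigenvalues on the diagonal.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical correlated data: 300 observations of 4 variables.
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 4))
Xc = X - X.mean(axis=0)

eigenvalues, eigenvectors = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Scores on all four components, one column per component.
scores = Xc @ eigenvectors

# Off-diagonal entries are covariances between different components.
score_cov = np.cov(scores, rowvar=False)
print(np.round(score_cov, 6))
```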

Covariance matrix or correlation matrix

  • Correlation matrix equivalent to standardizing
    • Absolute requirement if differing units
  • Covariance matrix de-emphasizes low variance variables
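
Both points can be illustrated with two variables measured in very different units. This is a sketch under stated assumptions: the height and weight data are simulated, and the variable names are hypothetical. It shows that the correlation matrix equals the covariance matrix of the standardized data, and that the covariance-based first component is almost entirely the high-variance variable.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: height in meters (tiny variance), weight in grams (huge variance).
height_m = rng.normal(1.7, 0.1, size=500)
weight_g = 70000 + 40000 * (height_m - 1.7) + rng.normal(0, 8000, size=500)
X = np.column_stack([height_m, weight_g])

# Correlation matrix versus covariance matrix of standardized variables.
R = np.corrcoef(X, rowvar=False)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
cov_of_Z = np.cov(Z, rowvar=False)

# First eigenvector of the raw covariance matrix (eigh sorts ascending).
_, vectors_cov = np.linalg.eigh(np.cov(X, rowvar=False))
first_cov_vector = vectors_cov[:, -1]

# First eigenvector of the correlation matrix, for comparison.
_, vectors_cor = np.linalg.eigh(R)
first_cor_vector = vectors_cor[:, -1]

print(first_cov_vector)  # dominated by weight
print(first_cor_vector)  # both variables weighted equally in absolute value
```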

How many components?

  • Percentage of variation accounted for
    • Scree plot
  • Eigenvalues > 1 (when using the correlation matrix)
  • Researcher preference/convenience
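
The first two criteria can be computed from the eigenvalues alone. The following sketch assumes NumPy and simulated data with two underlying dimensions (the data and names are illustrative). It reports the proportion and cumulative proportion of variation, and counts the eigenvalues greater than 1.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400

# Hypothetical data: six variables driven by two independent signals.
f1 = rng.normal(size=n)
f2 = rng.normal(size=n)
X = np.column_stack(
    [f1 + 0.6 * rng.normal(size=n) for _ in range(3)]
    + [f2 + 0.6 * rng.normal(size=n) for _ in range(3)]
)

R = np.corrcoef(X, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]  # largest first

proportion = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(proportion)
n_kaiser = int(np.sum(eigenvalues > 1))  # count of eigenvalues above 1

for i, (ev, cum) in enumerate(zip(eigenvalues, cumulative), start=1):
    print(f"component {i}: eigenvalue {ev:.3f}, cumulative {cum:.1%}")
print("eigenvalues > 1:", n_kaiser)
```

Plotting `eigenvalues` against the component number gives the scree plot. With a correlation matrix the eigenvalues sum to the number of variables, which is why 1 is a natural cutoff: it is the variation contributed by a single standardized variable.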

Communality

  • Amount of shared variation
    • Always between 0 and 1
    • Similar interpretation to R-squared
    • “One of these things is not like the others”
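
A communality can be computed as the sum of squared loadings across the retained components. The sketch below assumes NumPy, a correlation-matrix analysis, and simulated data in which the sixth variable is deliberately unrelated to the rest (all names are illustrative). It also checks the R-squared interpretation by regressing one standardized variable on the retained component scores.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 400

# Hypothetical data: five variables share two signals, the sixth is pure noise.
f1 = rng.normal(size=(n, 1))
f2 = rng.normal(size=(n, 1))
X = np.hstack([
    f1 + 0.6 * rng.normal(size=(n, 3)),
    f2 + 0.6 * rng.normal(size=(n, 2)),
    rng.normal(size=(n, 1)),
])
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

eigenvalues, eigenvectors = np.linalg.eigh(np.corrcoef(X, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

k = 2  # number of retained components
loadings = eigenvectors * np.sqrt(eigenvalues)
communality = np.sum(loadings[:, :k] ** 2, axis=1)

# R-squared from regressing the first variable on the k component scores.
scores = Z @ eigenvectors[:, :k]
beta, *_ = np.linalg.lstsq(scores, Z[:, 0], rcond=None)
residual = Z[:, 0] - scores @ beta
r_squared = 1 - np.sum(residual ** 2) / np.sum(Z[:, 0] ** 2)

print(np.round(communality, 3))
print(round(r_squared, 3))
```

A variable with a much lower communality than the others is the one that "is not like the others": the retained components explain little of its variation.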

Factor score matrix

  • Linear combination coefficients
  • Needed if you score by hand
  • No obvious interpretation
  • First component often has only positive coefficients
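
Scoring by hand amounts to multiplying the standardized variables by a matrix of coefficients. The sketch below is an assumption about one common convention, in which each eigenvector is divided by the square root of its eigenvalue so that the resulting scores have unit variance. Software packages may scale the coefficients differently. The data and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300

# Hypothetical data: four variables that are all positively correlated.
common = rng.normal(size=(n, 1))
X = common + 0.7 * rng.normal(size=(n, 4))
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

eigenvalues, eigenvectors = np.linalg.eigh(np.corrcoef(X, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Assumed convention: column i of the eigenvector matrix divided by sqrt(eigenvalue i).
score_coefficients = eigenvectors / np.sqrt(eigenvalues)

# Scores by hand: standardized data times the coefficient matrix.
scores_by_hand = Z @ score_coefficients
score_sd = scores_by_hand.std(axis=0, ddof=1)

# With all correlations positive, the first column has coefficients of one sign.
first = score_coefficients[:, 0]
same_sign = bool(np.all(first > 0) or np.all(first < 0))

print(np.round(score_coefficients, 3))
print(np.round(score_sd, 6), same_sign)
```

Because the sign of an eigenvector is arbitrary, the first column may come out all negative rather than all positive. Flipping the sign of the whole column gives an equivalent component.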

Component matrix

  • Interpret as correlation matrix
    • Rows are individual variables
    • Columns are principal components
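
The reading of the component matrix as a set of correlations can be confirmed numerically. This is a minimal sketch, assuming NumPy, a correlation-matrix analysis, and simulated data (names are illustrative). It builds the component matrix from the eigenvectors and eigenvalues, then compares it with correlations computed directly between the variables and the component scores.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical correlated data: 300 observations of 4 variables.
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 4))
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
p = Z.shape[1]

eigenvalues, eigenvectors = np.linalg.eigh(np.corrcoef(X, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Component matrix: each eigenvector scaled by the square root of its eigenvalue.
component_matrix = eigenvectors * np.sqrt(eigenvalues)

# Direct calculation: correlate each variable (rows) with each component (columns).
scores = Z @ eigenvectors
direct = np.corrcoef(Z, scores, rowvar=False)[:p, p:]

print(np.round(component_matrix, 3))
```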

Correlation matrix, 1 of 3

Correlation matrix, 2 of 3

Correlation matrix, 3 of 3

Communalities

Eigenvalues

Scree plot

Component matrix

Live demo, Principal components analysis

Break #1

  • What you have learned
    • Principal components analysis
  • What’s coming next
    • Applications of principal components

Applications

  • Visualization
    • Reduce a high-dimensional display to a few dimensions
    • Fewer graphs
  • Regression analysis
    • Fewer independent variables (rule of 15)
    • Removes collinearity
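
The regression application can be sketched as follows, assuming NumPy and simulated data with six nearly redundant predictors (the data, the outcome, and the choice of two components are all illustrative). The component scores replace the original predictors, and their correlation matrix is the identity, so collinearity among the new predictors is gone.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 250

# Hypothetical data: six highly collinear predictors and one outcome.
base = rng.normal(size=(n, 1))
X = base + 0.2 * rng.normal(size=(n, 6))
y = X.sum(axis=1) + rng.normal(size=n)

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
predictor_corr = np.corrcoef(X, rowvar=False)

eigenvalues, eigenvectors = np.linalg.eigh(predictor_corr)
order = np.argsort(eigenvalues)[::-1]
eigenvectors = eigenvectors[:, order]

# Keep the first k components as the new independent variables.
k = 2
scores = Z @ eigenvectors[:, :k]
score_corr = np.corrcoef(scores, rowvar=False)

# Ordinary least squares on an intercept plus the k component scores.
design = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
fitted = design @ coef
r_squared = 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)

print(np.round(score_corr, 6))
print(round(r_squared, 3))
```

Comparing this R-squared with the one from a regression on all the original variables shows how much predictive information the discarded components carried.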

Boxplots of first four principal components

Scatterplot of first four principal components

R-squared using four principal components

R-squared using all 24 variables

Live demo, Applications of principal components

Break #2

  • What you have learned
    • Applications of principal components
  • What’s coming next
    • Factor analysis

Philosophy behind factor analysis

  • Variance equals information
  • Covariance (correlation) equals shared information
  • Modeling shared information creates latent variables

Factor rotation

  • Recombine factors
  • Strive for simple interpretation
    • Components close to -1, 0, or 1
    • Each variable has one and only one non-zero component
    • Not always achievable
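
One widely used rotation is varimax, which is swapped in here purely as an illustration because the slides do not name a specific method. The sketch below assumes NumPy and uses a commonly circulated SVD-based iteration. The function name `varimax` and the example loadings are hypothetical. It starts from loadings that hide a simple structure behind a 30 degree rotation and rotates them back toward values near 0 or near plus or minus 1.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Return (rotated loadings, rotation matrix) for an orthogonal varimax rotation."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        previous = criterion
        rotated = loadings @ rotation
        # Gradient-like target matrix for the varimax criterion.
        target = rotated ** 3 - rotated @ np.diag(np.sum(rotated ** 2, axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt          # nearest orthogonal matrix
        criterion = np.sum(s)
        if previous != 0 and criterion / previous < 1 + tol:
            break
    return loadings @ rotation, rotation

# Hypothetical simple structure: each variable loads on exactly one factor.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.9, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.9]])

# Hide the structure by rotating the factors 30 degrees.
angle = np.deg2rad(30)
spin = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle),  np.cos(angle)]])
unrotated = simple @ spin

rotated, rotation = varimax(unrotated)
print(np.round(unrotated, 3))
print(np.round(rotated, 3))
```

An orthogonal rotation only recombines the factors. The sum of squared loadings in each row, which is the communality, is the same before and after rotation. The order and signs of the rotated columns are arbitrary.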

Rotated factor pattern, 1 of 3

Rotated factor pattern, 2 of 3

Rotated factor pattern, 3 of 3

Live demo, Factor analysis

Break #3

  • What you have learned
    • Factor analysis
  • What’s coming next
    • Criticisms of principal components analysis and factor analysis

Criticisms of principal components analysis

  • Advantages
    • Makes collection of many variables manageable
    • Eliminates collinearity issues
    • Focus only on important sources of variation
  • Disadvantages
    • Components often uninterpretable
    • False sense of parsimony

Criticisms of factor analysis

  • Advantages
    • Explore underlying structure
    • Create or validate subscales
  • Disadvantages
    • Difficulty in choosing number of factors
    • Reification

Summary

  • What you have learned
    • Principal components analysis
    • Applications of principal components
    • Factor analysis
    • Criticisms of principal components analysis and factor analysis

Additional topics?